itsCausal: Interrupted Time Series in real world data

The User’s Guide

Aurélien Sallin, Daniel Ammann, Tobias Müller

2024-05-11

Why we need interrupted time series: the “no control group” scenario

  • Problem: In many real-world scenarios, finding a credible control group to evaluate a policy is not feasible.
  • Example: Health care policies often affect all providers simultaneously, such as changes in physician compensation or in clinical guidelines.
  • Challenge: Standard methods, such as before-after comparisons or conventional interrupted time series (ITS) regressions, may produce biased estimates because they forecast the counterfactual poorly and rely on simplifying assumptions when applied to panel data.

With “itsCausal”, we aim to deliver a set of recommendations for practitioners

  • Key Idea:
    • Interrupted time series are heavily used in real world evidence, public health, health services research, health economics, and epidemiological research, but there is little guidance on how to use them.
    • Extend the limited “work-horse” model with a machine learning-based method to address the “no control group” scenario and better forecast the counterfactual.
    • Tailored for panel data such as health insurance claims data, where the number of units is larger than the number of time periods.
  • How it works:
    • Use machine learners and ensemble methods to train models on pre-intervention data.
    • Forecast onto the post-intervention period (“counterfactual”).
    • Compare actual outcomes with predicted counterfactuals to estimate causal effects.
    • Aggregate results for the desired population and the desired time period.
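
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the itsCausal implementation: the data are simulated, the function names (`fit_linear_trend`, `its_effect`) are hypothetical, and a per-unit linear trend stands in for the machine learners and ensembles mentioned above.

```python
import random
from statistics import fmean


def fit_linear_trend(ys):
    """OLS fit of y = a + b*t on t = 0..len(ys)-1 (simple stand-in for an ML learner)."""
    n = len(ys)
    t_bar, y_bar = (n - 1) / 2, fmean(ys)
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    b = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(ys)) / sxx
    return y_bar - b * t_bar, b


def its_effect(panel, t0):
    """panel maps unit -> list of outcomes; t0 is the first post-intervention index.

    Returns the mean of (observed - counterfactual) over all units and post periods.
    """
    gaps = []
    for ys in panel.values():
        a, b = fit_linear_trend(ys[:t0])         # 1. train on pre-intervention data
        for t in range(t0, len(ys)):
            counterfactual = a + b * t           # 2. forecast the post-intervention period
            gaps.append(ys[t] - counterfactual)  # 3. compare observed and counterfactual
    return fmean(gaps)                           # 4. aggregate over units and periods


# Simulated panel (assumption): 200 units, 36 periods, intervention at period 24.
rng = random.Random(1)
T, T0, TRUE_EFFECT = 36, 24, -2.0
panel = {}
for unit in range(200):
    level, slope = rng.gauss(10, 2), rng.gauss(0.1, 0.02)
    panel[unit] = [
        level + slope * t + (TRUE_EFFECT if t >= T0 else 0.0) + rng.gauss(0, 1)
        for t in range(T)
    ]

estimate = its_effect(panel, T0)
print(f"true effect: {TRUE_EFFECT}, estimated effect: {estimate:.2f}")
```

Averaging within a subgroup or over a shorter post-intervention window only changes which gaps enter the final mean.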

itsCausal was born from the practical need for effective healthcare monitoring

  • Monitor the effects of population-level interventions using rich (panel) claims data.
  • Example: interventions against low-value care (Vit. D testing) in Switzerland.

Two interventions against low-value care (Vit. D testing) in Switzerland (N = 209493 physician-months)

Development of a user’s guide for itsCausal

  • Use of ML learners (random forest, gradient boosting, neural networks, CatBoost, LSTM) with hyperparameter tuning.
  • Rolling-window approach to forecast the post-intervention period, allowing for time-invariant and time-varying predictors.
  • The effect is the difference between the observed and the forecasted value.
  • Simulations of different data-generating processes show good performance.
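
To make the forecasting and simulation points above concrete, the sketch below simulates ARMA(1,1) series with a known post-intervention effect, fits a lagged-outcome model on pre-intervention data, and forecasts recursively, feeding each forecast back in as the next lag. Everything here is an assumption for illustration: the parameter values, the helper names, and the pooled AR(1) regression, which is a deliberately simple stand-in for the tuned ML learners. It is one possible reading of a rolling forecast, not the package's procedure.

```python
import random
from statistics import fmean


def simulate_arma11(n, c, phi, theta, rng, burn_in=50):
    """Simulate y_t = c + phi*y_{t-1} + eps_t + theta*eps_{t-1} with N(0, 1) shocks."""
    y_prev, eps_prev = c / (1 - phi), 0.0  # start at the stationary mean
    out = []
    for t in range(n + burn_in):
        eps = rng.gauss(0, 1)
        y = c + phi * y_prev + eps + theta * eps_prev
        if t >= burn_in:  # discard the burn-in draws
            out.append(y)
        y_prev, eps_prev = y, eps
    return out


def fit_pooled_ar1(series_list):
    """Pooled OLS of y_t on y_{t-1} across all units; returns (intercept, slope)."""
    xs = [y[t - 1] for y in series_list for t in range(1, len(y))]
    zs = [y[t] for y in series_list for t in range(1, len(y))]
    x_bar, z_bar = fmean(xs), fmean(zs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs)) / sxx
    return z_bar - b * x_bar, b


def recursive_forecast(last_obs, a, b, horizon):
    """Multi-step forecast: each prediction becomes the lag for the next step."""
    preds, prev = [], last_obs
    for _ in range(horizon):
        prev = a + b * prev
        preds.append(prev)
    return preds


rng = random.Random(42)
N_UNITS, T_PRE, T_POST, TRUE_EFFECT = 500, 24, 6, 1.5
observed = []
for _ in range(N_UNITS):
    y0 = simulate_arma11(T_PRE + T_POST, c=2.0, phi=0.6, theta=0.3, rng=rng)
    # Observed outcome = no-policy outcome + effect in post-intervention periods.
    observed.append([v + (TRUE_EFFECT if t >= T_PRE else 0.0) for t, v in enumerate(y0)])

a, b = fit_pooled_ar1([y[:T_PRE] for y in observed])  # pre-intervention data only
forecasts = [recursive_forecast(y[T_PRE - 1], a, b, T_POST) for y in observed]

# Effect per horizon = mean over units of (observed - forecast).
effect_by_horizon = [
    fmean([y[T_PRE + h] - f[h] for y, f in zip(observed, forecasts)])
    for h in range(T_POST)
]
average_effect = fmean(effect_by_horizon)
print([round(e, 2) for e in effect_by_horizon], round(average_effect, 2))
```

Because the true effect is known in a simulation, the gap between `average_effect` and `TRUE_EFFECT` gives a direct measure of how well a learner recovers it under a given data-generating process.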

Simulation of an ARMA process

  • We benchmark our method with experimental evidence:
    • Randomized controlled trial: primary care physicians in Switzerland were sent an information letter combining professional norms and peer comparison feedback about vitamin D testing (Müller, Van Gestel, and Gerfin 2023).
    • ATE from itsCausal is within the 95% CI of the ATE from the RCT
  • We re-estimate our earlier analyses of low-value care with itsCausal and find similar results.

  • We offer researchers and industry analysts in Public Health, Health Economics, and Health Services Research recommendations for the implementation of interrupted time-series.
  • We build tests, and graphical representations of these tests, for the assumptions needed to identify causal effects, especially the “no-confounding” assumption.
  • We implement ML learners and provide guidance on how to implement them.
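
One common way to probe such assumptions is a placebo-in-time check: pretend the intervention happened earlier, inside the pre-intervention period, and verify that the estimated "effect" there is close to zero. The sketch below is an illustration under stated assumptions (simulated data, hypothetical function names, a per-unit linear trend as the forecasting model). The source does not specify which tests itsCausal implements, so this should not be read as the package's procedure.

```python
import random
from math import sqrt
from statistics import fmean, stdev


def unit_gap(ys, fit_end, eval_end):
    """Mean of (observed - forecast) over periods fit_end..eval_end-1.

    The forecast is a linear trend fitted by OLS on periods 0..fit_end-1.
    """
    t_bar, y_bar = (fit_end - 1) / 2, fmean(ys[:fit_end])
    sxx = sum((t - t_bar) ** 2 for t in range(fit_end))
    slope = sum((t - t_bar) * (ys[t] - y_bar) for t in range(fit_end)) / sxx
    intercept = y_bar - slope * t_bar
    return fmean([ys[t] - (intercept + slope * t) for t in range(fit_end, eval_end)])


def mean_gap_with_z(panel, fit_end, eval_end):
    """Average gap across units and its z-statistic (units assumed independent)."""
    gaps = [unit_gap(ys, fit_end, eval_end) for ys in panel]
    mean_gap = fmean(gaps)
    return mean_gap, mean_gap / (stdev(gaps) / sqrt(len(gaps)))


# Simulated panel (assumption): no effect before period 24, effect of -2 afterwards.
rng = random.Random(7)
T, REAL_T0, FAKE_T0, TRUE_EFFECT = 36, 24, 18, -2.0
panel = []
for _ in range(300):
    level, slope = rng.gauss(10, 2), rng.gauss(0.1, 0.02)
    panel.append([
        level + slope * t + (TRUE_EFFECT if t >= REAL_T0 else 0.0) + rng.gauss(0, 1)
        for t in range(T)
    ])

# Placebo: fit on periods 0..17, evaluate on 18..23, where no policy was in place.
placebo_gap, placebo_z = mean_gap_with_z(panel, FAKE_T0, REAL_T0)
# Actual estimate: fit on periods 0..23, evaluate on 24..35.
real_gap, real_z = mean_gap_with_z(panel, REAL_T0, T)
print(f"placebo gap: {placebo_gap:.2f} (z = {placebo_z:.1f})")
print(f"estimated effect: {real_gap:.2f} (z = {real_z:.1f})")
```

A placebo gap far from zero would suggest that the forecasting model is misspecified or that something other than the policy shifted the outcome before the intervention.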

Conclusion

  • itsCausal: A powerful tool for estimating causal effects without a control group.
  • Benchmarking of our method with published research and experimental evidence.
  • User’s guide: Helps industry researchers implement ITS soundly in real world evidence, public health, health services research, and health economics.




Identifying assumptions for the causal effect in ITS

The following assumptions must hold (see Cerqua, Letta, and Menchetti (2024)):

  1. There are no hidden forms of treatment leading to different potential outcomes (weak SUTVA).
  2. Additivity
  3. No anticipation and no confounding:
     • Absence of anticipatory effects of the intervention on the covariates and the potential outcomes.
     • Future covariates do not affect current potential outcomes.
     • Covariates remain unaffected by the policy in the post-intervention period (post-treatment exogeneity of the covariates).
  4. Dynamic potential outcomes model: the potential outcomes absent the policy (the “counterfactual”) can be predicted using lagged values of the outcome and of the covariates.
  5. Post-intervention non-linear multi-step-ahead model: the counterfactual can be predicted for multiple periods ahead using lagged values of the outcomes until the intervention, conditional expectations of the outcome after the intervention, and the covariates.
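
Assumptions 4 and 5, together with the effect definition used throughout, can be written compactly as follows. The notation is an assumption introduced here for illustration and may differ from Cerqua, Letta, and Menchetti (2024): i indexes units, t periods, t_0 is the last pre-intervention period, p the number of lags, X the covariates, and f an unknown, possibly non-linear function.

```latex
% Assumption 4: dynamic model for the no-policy potential outcome
Y_{i,t}(0) = f\bigl(Y_{i,t-1}(0), \dots, Y_{i,t-p}(0),\; X_{i,t-1}, \dots, X_{i,t-p}\bigr) + \varepsilon_{i,t}

% Assumption 5: h-step-ahead forecast after the intervention, h = 1, \dots, H.
% Lagged outcomes are observed up to t_0 and replaced by forecasts afterwards.
\hat{Y}_{i,t_0+h}(0) = \hat{f}\bigl(\tilde{Y}_{i,t_0+h-1}, \dots, \tilde{Y}_{i,t_0+h-p},\; X_{i,t_0+h-1}, \dots, X_{i,t_0+h-p}\bigr),
\qquad
\tilde{Y}_{i,s} =
\begin{cases}
  Y_{i,s} & \text{if } s \le t_0 \\
  \hat{Y}_{i,s}(0) & \text{if } s > t_0
\end{cases}

% Estimated effects: unit-period gap and its average over N units and H periods
\hat{\tau}_{i,t_0+h} = Y_{i,t_0+h} - \hat{Y}_{i,t_0+h}(0),
\qquad
\hat{\tau} = \frac{1}{N H} \sum_{i=1}^{N} \sum_{h=1}^{H} \hat{\tau}_{i,t_0+h}
```

Using observed post-intervention covariates inside the forecast is only legitimate under the post-treatment exogeneity condition in assumption 3.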

References

Lopez Bernal, James, S Soumerai, and Antonio Gasparrini. 2018. “A Methodological Framework for Model Selection in Interrupted Time Series Studies.” Journal of Clinical Epidemiology 103: 82–91.
Brodersen, Kay H, Fabian Gallusser, Jim Koehler, Nicolas Remy, and Steven L Scott. 2015. “Inferring Causal Impact Using Bayesian Structural Time-Series Models.”
Cerqua, Augusto, Marco Letta, and Fiammetta Menchetti. 2024. “Causal Inference and Policy Evaluation Without a Control Group.” arXiv Preprint arXiv:2312.05858.
Chernozhukov, Victor, Kaspar Wüthrich, and Yinchu Zhu. 2021. “An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls.” Journal of the American Statistical Association 116 (536): 1849–64.
Lopez Bernal, James, Steven Cummins, and Antonio Gasparrini. 2018. “The Use of Controls in Interrupted Time Series Studies of Public Health Interventions.” International Journal of Epidemiology 47 (6): 2082–93.
Müller, Tobias, Raf Van Gestel, and Michael Gerfin. 2023. “Pairing Professional Norms and Peer Comparison Feedback to Reduce Low-Value Care: Evidence from a Randomized Controlled Trial Among Primary Care Physicians.” Working Paper.
Turner, Simon L, Amalia Karahalios, Andrew B Forbes, Monica Taljaard, Jeremy M Grimshaw, and Joanne E McKenzie. 2021. “Comparison of Six Statistical Methods for Interrupted Time Series Studies: Empirical Evaluation of 190 Published Series.” BMC Medical Research Methodology 21 (1): 134.